fix(buffers): fix panic in disk buffer when dealing with corrupted file #23617
Merged
pront reviewed on Sep 10, 2025
pront reviewed on Sep 16, 2025
pront approved these changes on Sep 16, 2025
vparfonov pushed a commit to vparfonov/vector that referenced this pull request on Oct 21, 2025:
…le (vectordotdev#23617)
* fix panic in disk buffer when dealing with corrupted file
* Allow clippy too many lines in test
* cargo fmt
* simplify test
* Update changelog.d/disk_buffer_panic_if_corrupted_file.fix.md
Co-authored-by: Thomas
Co-authored-by: Pavlos Rontidis
openshift-merge-bot bot pushed a commit to ViaQ/vector that referenced this pull request on Oct 21, 2025
openshift-merge-bot bot pushed a commit to ViaQ/vector that referenced this pull request on Oct 24, 2025
Summary
At startup, the writer checks the last written file. If that file is corrupted, the writer sets a flag to skip to the next file, but the current writer file ID in the ledger is not updated and the next file is not created.
When the reader hits this corrupted file, it rolls over to the next file and updates the reader file ID. At this point the reader file ID can be greater than the writer file ID if the reader was already done with the last file. This rollover happens in `seek_to_next_record` in the reader.
At this point the reader, writer, and ledger are all initialized, and the expectation is that any read should block until the writer creates a new file and writes data to it.
Instead, every read at this point keeps incrementing the next file ID to read until it wraps around to the current file, which was already read. This causes panics in various places.
The fix is to wait for the writer to create a file if the reader file ID is either the current or the next writer file ID. The writer may end up creating either one of them, depending on the skip flag.
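The condition described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `MAX_FILE_ID`, `next_file_id`, and `reader_should_wait` are hypothetical names, and the wrap-around limit is an assumed value.

```rust
/// Hypothetical number of data file IDs; IDs wrap around to 0 after
/// `MAX_FILE_ID - 1`. The real limit in the disk buffer may differ.
const MAX_FILE_ID: u16 = 32;

/// Returns the file ID that follows `id`, wrapping around at `MAX_FILE_ID`.
fn next_file_id(id: u16) -> u16 {
    (id + 1) % MAX_FILE_ID
}

/// Returns true when the reader should wait for the writer instead of
/// rolling over again: the reader is on the writer's current file ID or on
/// the one right after it. The writer may create either file, depending on
/// whether it decided to skip a corrupted file at startup.
fn reader_should_wait(reader_file_id: u16, writer_file_id: u16) -> bool {
    reader_file_id == writer_file_id || reader_file_id == next_file_id(writer_file_id)
}
```

With these assumptions, a reader on file 6 waits when the writer is still recorded as being on file 5, and the wrap-around case (writer on the last ID, reader on 0) is treated the same way.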
The sequence of logs may also explain what happens when we hit this issue.
The reader hits the bad file and rolls over to the next file.
A subsequent read goes through this loop and keeps incrementing the reader file ID until it wraps around and opens the same file again for reading. This creates a condition where the new record ID is lower than the previous record ID, which gives the impression that a huge number of records were skipped. The total buffer size in the ledger is also decreased and may drop below 0, causing a panic.
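The failure mode above can be illustrated with a small model. This is a sketch under stated assumptions, not Vector's implementation: `FILE_ID_COUNT`, `roll_over_until_found`, and `decrease_buffer_size` are hypothetical, and the buffer size is assumed to be an unsigned counter.

```rust
/// Hypothetical number of data file IDs before wrap-around.
const FILE_ID_COUNT: u16 = 4;

/// Models a reader that keeps rolling over to the next file ID while no file
/// exists there. Returns the first existing file ID it lands on, or `None`
/// if a full cycle finds nothing.
fn roll_over_until_found(start: u16, existing: &[u16]) -> Option<u16> {
    let mut id = start;
    for _ in 0..FILE_ID_COUNT {
        id = (id + 1) % FILE_ID_COUNT;
        if existing.contains(&id) {
            return Some(id);
        }
    }
    None
}

/// Decreases an unsigned buffer size, returning `None` on underflow instead
/// of panicking or wrapping.
fn decrease_buffer_size(total: u64, amount: u64) -> Option<u64> {
    total.checked_sub(amount)
}
```

In this model, a reader that finished file 2 and finds no later file ends up back on file 2, so already-read records are counted again. Subtracting their size a second time is what `decrease_buffer_size` reports as `None`.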
Vector configuration
How did you test this PR?
Added a unit test that reproduces the panic if the fix is removed.
Change Type
Is this a breaking change?
Does this PR include user facing changes?
`no-changelog` label to this PR.
References
Notes
`@vectordotdev/vector` to reach out to us regarding this PR.
`pre-push` hook, please see this template.
`cargo fmt --all`
`cargo clippy --workspace --all-targets -- -D warnings`
`cargo nextest run --workspace` (alternatively, you can run `cargo test --all`)
`git merge origin master` and `git push`.
`Cargo.lock`), please run `cargo vdev build licenses` to regenerate the license inventory and commit the changes (if any). More details here.